Integration under the Schema Tuple Query Assumption
Abstract
Data integration systems typically have significant gaps in coverage of the global (or mediated) schema they purport to cover. Given this reality, users want to know exactly which part of their query is supported by the available data sources. This report introduces a set of assumptions that enable users to obtain intensional descriptions of the certain, uncertain and missing answers to their queries, given the available data sources. The general assumption is that queries and source descriptions are written as tuple relational queries that return only whole schema tuples as answers. More specifically, queries and source descriptions must fall within an identified sub-class of these ‘schema tuple queries’ that is closed under syntactic query difference. Because this class is closed under difference and decidable for satisfiability, query containment and equivalence are also decidable: one query contains another exactly when their difference is unsatisfiable. Setting aside the schema tuple query assumption, the identified query class is more expressive than conjunctive queries with negated subgoals. Because members of the query class can be expressed directly in standard SQL, this work is immediately applicable in a wide variety of contexts.
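As a rough illustration of two ideas in the abstract (queries that return only whole tuples of a schema relation, and a difference formed syntactically from two such queries), here is a minimal sketch in Python using the standard `sqlite3` module. The `movie` table, its columns, the sample rows and the helper `difference_query` are all hypothetical, and the sketch covers only simple selection conditions over a single relation with no NULL values; it is not the report's actual query class or construction.

```python
import sqlite3


def difference_query(table, cond_q, cond_s):
    """Hypothetical helper: build Q - S purely from the text of two
    whole-tuple queries over the same table (assumes no NULL values)."""
    return f"SELECT * FROM {table} WHERE ({cond_q}) AND NOT ({cond_s})"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE movie ("
    "title TEXT NOT NULL, year INTEGER NOT NULL, genre TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO movie VALUES (?, ?, ?)",
    [
        ("A", 1999, "drama"),
        ("B", 2001, "drama"),
        ("C", 2005, "comedy"),
        ("D", 1995, "comedy"),
    ],
)

# Both the user query and the (assumed) source description use SELECT *,
# so each answer is a whole tuple of the schema relation `movie`.
cond_q = "year >= 2000"     # user query Q
cond_s = "genre = 'drama'"  # source description S

q_rows = set(conn.execute(f"SELECT * FROM movie WHERE {cond_q}"))
s_rows = set(conn.execute(f"SELECT * FROM movie WHERE {cond_s}"))

# The difference is itself a whole-tuple query over the same relation.
diff_sql = difference_query("movie", cond_q, cond_s)
diff_rows = set(conn.execute(diff_sql))

print(diff_sql)
print(diff_rows == q_rows - s_rows)
```

Under these assumptions the difference query describes, as a query rather than as a list of rows, the part of Q that the source S does not cover. If such a difference can be shown to be unsatisfiable, Q is contained in S.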
Similar resources
Reasoning in Data Integration Systems: why LAV and GAV are Siblings
Data integration consists in providing a uniform access to a set of data sources, through a unified representation of the data called global schema; a mapping specifies the relationship between the global schema and the sources. Integrity constraints (ICs) are expressed on the global schema to better represent the domain of interest; in general, ICs are not satisfied by the data at the sources....
Knowledge Representation using Schema Tuple Queries
This paper introduces schema tuple queries and argues for their suitability in representing knowledge over standard relational databases. Schema tuple queries are queries that return only whole tuples of schema relations. In particular a subclass of the schema tuple queries is identified that is decidable for satisfiability and is closed over syntactic query difference. These properties enable ...
Towards A Unified Framework For Schema Merging
Merging schemas to create a mediated view is a recurring problem in applications related to data interoperability. The task becomes particularly challenging when the schemas are highly heterogeneous and autonomous. Classical data integration systems rely on a mediated schema created by human experts through an intensive design process. Automatic generation of mediated schemas is still a goal to...
NoSym: Non-Symbolic Databases for Data Decoupling
Under the Unique Name Assumption (UNA), users need to have shared agreements on signifiers to use in schema or data, e.g. to use “genre” and not “type” to refer to a movie’s category. Agreements are difficult in open environments such as datasets on the web, open data, and crowd-sourced databases, thus this assumption can be invalid. Schema matching and data integration can be limited in respon...
Data exchange: query answering for incomplete data sources
Data exchange is the problem of transforming data structured under a schema, called the source schema, into data structured under another schema, called the target schema. Existing work on data exchange considers settings where the source instance does not contain incomplete information. In this paper we study semantics and address algorithmic issues for data exchange settings where the source ...
Publication date: 2003